NSF PAR Search | NSF Public Access Repository

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

Chen, Xuxi; Yang, Yu; Wang, Zhangyang; Mirzasoleiman, Baharan (May 2024, International Conference on Learning Representations (ICLR))

Full Text Available
Learning to Optimize Differentiable Games

Chen, Xuxi; Vadori, Nelson; Chen, Tianlong; Wang, Zhangyang (July 2023, International Conference on Machine Learning)

Many machine learning problems, such as generative adversarial networks (GANs) and multi-agent reinforcement learning, can be formulated as games and boil down to optimizing nested objectives. Solving these games requires finding their stable fixed points or Nash equilibria. However, existing algorithms for solving games suffer from empirical instability and therefore demand heavy ad-hoc tuning in practice. To tackle these challenges, we resort to the emerging scheme of Learning to Optimize (L2O), which discovers problem-specific efficient optimization algorithms through data-driven training. Our customized L2O framework for differentiable game theory problems, dubbed "Learning to Play Games" (L2PG), seeks a stable fixed-point solution by predicting the fast update direction from the past trajectory, using a novel gradient stability-aware, sign-based loss function. We further incorporate curriculum learning and self-learning to strengthen the empirical training stability and generalization of L2PG. On test problems including quadratic games and GANs, L2PG substantially accelerates convergence and demonstrates a remarkably more stable trajectory. Code is available at https://github.com/VITA-Group/L2PG.
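The instability that this abstract attributes to existing game solvers can be illustrated with a minimal sketch. It is not the L2PG method: it runs plain simultaneous gradient descent-ascent on the bilinear game f(x, y) = x * y, a toy quadratic game chosen here as an assumption for illustration. The function name `simultaneous_gda` and all parameter values are hypothetical. Each update multiplies the distance to the equilibrium (0, 0) by sqrt(1 + step^2), so the iterates spiral outward instead of converging.

```python
import math


def simultaneous_gda(x, y, step, num_steps):
    """Run simultaneous gradient descent-ascent on f(x, y) = x * y.

    Player x minimizes f and player y maximizes f. Returns the distance of
    each iterate from the unique equilibrium (0, 0), starting point included.
    """
    norms = [math.hypot(x, y)]
    for _ in range(num_steps):
        grad_x, grad_y = y, x  # df/dx = y, df/dy = x
        # Both players update from the same old iterate (simultaneous update).
        x, y = x - step * grad_x, y + step * grad_y
        norms.append(math.hypot(x, y))
    return norms


# Illustrative values only: start at (1, 1) with step size 0.1.
norms = simultaneous_gda(1.0, 1.0, step=0.1, num_steps=50)
print(norms[0], norms[-1])
```

The growth factor follows from expanding (x - step * y)^2 + (y + step * x)^2 = (1 + step^2) * (x^2 + y^2), which is why a smaller step size slows the divergence but does not remove it.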
Full Text Available
M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation

Yang, Junjie; Chen, Xuxi; Chen, Tianlong; Wang, Zhangyang; Liang, Yingbin (April 2023, International Conference on Learning Representations (ICLR))

Full Text Available
M-L2O: Towards generalizable learning-to-optimize by test-time fast self-adaptation

Yang, Junjie; Chen, Xuxi; Chen, Tianlong; Wang, Zhangyang; Liang, Yingbin (January 2023, International Conference on Learning Representations (ICLR))

Full Text Available
Back Razor: Memory-Efficient Transfer Learning by Self-Sparsified Backpropagation

Jiang, Ziyu; Chen, Xuxi; Huang, Xueqin; Du, Xianzhi; Zhou, Denny; Wang, Zhangyang (December 2022, Advances in neural information processing systems)

Transfer learning from models trained on large datasets to customized downstream tasks has been widely used, as the pre-trained model can greatly boost generalizability. However, the increasing sizes of pre-trained models also lead to prohibitively large memory footprints for downstream transfer, making them unaffordable for personal devices. Previous work identifies the activations as the bottleneck of the footprint, and hence proposes various solutions such as injecting specific lite modules. In this work, we present a novel memory-efficient transfer framework called Back Razor, which can be applied in a plug-and-play manner to any pre-trained network without changing its architecture. The key idea of Back Razor is asymmetric sparsifying: pruning the activation stored for backpropagation while keeping the forward activation dense. It is based on the observation that the stored activation, which dominates the memory footprint, is only needed for backpropagation. Such asymmetric pruning avoids affecting the precision of the forward computation, thus making more aggressive pruning possible. Furthermore, we provide a theoretical analysis of the convergence rate of Back Razor, showing that under mild conditions our method retains a convergence rate similar to that of vanilla SGD. Extensive transfer learning experiments on both Convolutional Neural Networks and Vision Transformers with classification, dense prediction, and language modeling tasks show that Back Razor can yield up to 97% sparsity, saving 9.2x memory usage, without losing accuracy. The code is available at: https://github.com/VITA-Group/BackRazor_Neurips22.
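The asymmetric sparsifying idea in this abstract can be sketched for a single linear layer in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the class `SparseBackwardLinear`, the helper `top_k_sparsify`, and the choice of magnitude-based top-k pruning are all hypothetical. The forward output is computed from the dense input, while only a pruned copy of the input is kept for the weight gradient.

```python
def top_k_sparsify(values, keep):
    """Zero every entry except the `keep` largest in magnitude (assumed rule)."""
    if keep >= len(values):
        return list(values)
    ranked = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    kept = set(ranked[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(values)]


class SparseBackwardLinear:
    """Toy linear layer y = W x with a dense forward and a sparse saved input."""

    def __init__(self, weight, keep):
        self.weight = weight      # list of rows, shape (out_features, in_features)
        self.keep = keep          # number of input entries stored for backward
        self.saved_input = None

    def forward(self, x):
        # Forward pass uses the full, dense input, so the output is exact.
        out = [sum(w * xi for w, xi in zip(row, x)) for row in self.weight]
        # Only a pruned copy of the input is retained for backpropagation.
        self.saved_input = top_k_sparsify(x, self.keep)
        return out

    def backward(self, grad_out):
        # Weight gradient is approximate: it uses the sparsified stored input.
        grad_weight = [[g * xi for xi in self.saved_input] for g in grad_out]
        # Input gradient W^T grad_out does not need the stored input at all.
        n_in = len(self.weight[0])
        grad_input = [
            sum(self.weight[o][i] * grad_out[o] for o in range(len(grad_out)))
            for i in range(n_in)
        ]
        return grad_weight, grad_input


# Illustrative values only.
layer = SparseBackwardLinear([[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]], keep=1)
y = layer.forward([0.1, -4.0, 2.0])
grad_weight, grad_input = layer.backward([1.0, 1.0])
print(y, layer.saved_input, grad_weight, grad_input)
```

In this sketch the memory saving would come from storing `saved_input` in a compressed sparse format; the dense list with zeros is kept here only for readability.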
Full Text Available
Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets

Chen, Tianlong; Chen, Xuxi; Ma, Xiaolong; Wang, Yanzhi; Wang, Zhangyang (July 2022, International Conference on Machine Learning)

Full Text Available
HotProtein: A Novel Framework for Protein Thermostability Prediction and Editing

Chen, Tianlong; Gong, Chengyue; Diaz, Daniel J; Chen, Xuxi; Wells, Jordan T; Liu, Qiang; Wang, Zhangyang; Ellington, Andrew D; Dimakis, Alexandros G; Klivans, Adam (February 2023, ICLR 2023 https://openreview.net/forum?id=YDJRFWBMNby)

The molecular basis of protein thermal stability is only partially understood and has major significance for drug and vaccine discovery. The lack of datasets and standardized benchmarks considerably limits learning-based discovery methods. We present HotProtein, a large-scale protein dataset with growth temperature annotations of thermostability, containing K amino acid sequences and K folded structures from different species with a wide temperature range. Due to functional domain differences and data scarcity within each species, existing methods fail to generalize well on our dataset. We address this problem through a novel learning framework, consisting of (1) protein structure-aware pre-training (SAP), which leverages 3D information to enhance sequence-based pre-training; and (2) factorized sparse tuning (FST), which utilizes low-rank and sparse priors as an implicit regularization, together with feature augmentations. Extensive empirical studies demonstrate that our framework improves thermostability prediction compared to other deep learning models. Finally, we introduce a novel editing algorithm to efficiently generate positive amino acid mutations that improve thermostability. Code is available at https://github.com/VITA-Group/HotProtein.
Full Text Available
Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?

Ma, Xiaolong; Yuan, Geng; Shen, Xuan; Chen, Tianlong; Chen, Xuxi; Chen, Xiaohan; Liu, Ning; Qin, Minghai; Liu, Sijia; Wang, Zhangyang; et al (December 2021, Advances in Neural Information Processing Systems (NeurIPS))

Full Text Available
